Verification of Effective Retrieval Method for Anchor Text on Navigational Retrieval

Authors

  • Kenji Tateishi
  • Hideki Kawai
  • Dai Kusui
  • Toshikazu Fukushima
Abstract

We participated in the NTCIR-5 WEB Navigational Retrieval Subtask (Navi-2) to determine the most effective retrieval method for an index of anchor texts, using a retrieval system that indexed only anchor texts rather than the full texts of Web pages. We introduced retrieval methods that combine one or more of six retrieval measures: (a) anchor frequency (af), (b) reference consistency (rc), (c) query weight (qw), (d) page representativeness (rep), (e) site relevancy (sr), and (f) inverse anchor document frequency (iadf). The experimental results suggest that (1) a retrieval method using only anchor frequency on the anchor-text index was more effective than a retrieval method on an index of only the full texts of Web pages, and (2) retrieval methods that included sr or iadf were effective on the anchor-text index, with sr more effective than iadf.

Keywords: Anchor Text, Navigational Retrieval, Site Relevancy
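
The abstract names the retrieval measures but does not give their formulas, so the following is only a minimal sketch of how scoring over an anchor-text index might look. It assumes that anchor frequency (af) counts the anchors pointing at a page that contain a query term, and that iadf is an IDF-style weight computed over per-page collections of anchor texts. The sample data, the function names, and both definitions are illustrative assumptions, not the authors' method.

```python
import math
from collections import defaultdict

# Hypothetical anchor index: (anchor text, target URL) pairs harvested from links.
ANCHORS = [
    ("NEC", "http://www.nec.example/"),
    ("NEC home page", "http://www.nec.example/"),
    ("NEC", "http://www.nec.example/"),
    ("NEC products", "http://www.nec.example/products"),
    ("products", "http://shop.example/"),
]


def build_anchor_docs(anchors):
    """Group tokenized anchor texts by the page they point to."""
    docs = defaultdict(list)
    for text, target in anchors:
        docs[target].append(text.lower().split())
    return docs


def iadf(term, docs):
    """Assumed IDF-style weight: log(N / pages whose anchors contain the term)."""
    n = sum(1 for anchor_list in docs.values()
            if any(term in tokens for tokens in anchor_list))
    return math.log(len(docs) / n) if n else 0.0


def score(query, docs, use_iadf=False):
    """Rank pages by summed anchor frequency, optionally weighted by iadf."""
    terms = query.lower().split()
    scores = {}
    for target, anchor_list in docs.items():
        total = 0.0
        for term in terms:
            # Assumed af: number of anchors to this page that contain the term.
            af = sum(1 for tokens in anchor_list if term in tokens)
            weight = iadf(term, docs) if use_iadf else 1.0
            total += af * weight
        if total > 0:
            scores[target] = total
    # Highest score first; ties broken by URL for a stable order.
    return sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))


if __name__ == "__main__":
    docs = build_anchor_docs(ANCHORS)
    for url, s in score("NEC", docs):
        print(f"{s:.2f}  {url}")
```

Under these assumptions, the page that many links label with the query term ranks first, which matches the intuition behind navigational retrieval: the page most often named by others is likely the representative one.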

Similar resources

Exploiting Anchor Text for the Navigational Web Retrieval at NTCIR-5

In the Navigational Retrieval Subtask 2 (Navi-2) at the NTCIR-5 WEB Task, a hypothetical user knows a specific item (e.g., a product, company, or person) and needs to find one or more representative Web pages related to the item. This paper describes the system with which we participated in the Navi-2 subtask and reports its evaluation results. Our system uses three types of information obt...

Using Text Surrounding Method to Enhance Retrieval of Online Images by Google Search Engine

Purpose: The current research aimed to compare the effectiveness of various tags and codes for retrieving images from Google. Design/methodology: Selected images with different characteristics in a registered domain were carefully studied. The exception was that special conceptual features have been apportioned for each group of images separately. In this regard, each group image surr...

TREC-10 Web Track Experiments at MSRA

In TREC-10, Microsoft Research Asia (MSRA) participated in the Web track (ad hoc retrieval task and homepage finding task). The latest version of the Okapi system (Windows 2000 version) was used. We focused on developing content-based retrieval and link-based retrieval, and investigated a suitable combination of the two. For content-based retrieval, we examined the problems of weighting...

Image retrieval using the combination of text-based and content-based algorithms

Image retrieval is an important research field that has received great attention in recent decades. In this paper, we present an approach to image retrieval based on the combination of text-based and content-based features. Keywords are used as the text-based features, and color and texture as the content-based features. Query in this system contains some keywords and an input...

Evaluation of Web Retrieval Methods Using Anchor Text

In this paper, we evaluate two types of anchor texts: a page anchor and a site anchor. Since anchor text tends to summarize the information it refers to, the terms appearing in it can be expected to carry important meaning for information retrieval. We introduce a retrieval method that gives high priority to the terms in the anchor text. In the experiment, we compared the proposed method with...

Publication date: 2005